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The nucleotide sequence of the peplomer ( E2 ) gene of MHV-A59 was determined from a set of overlapping cDNA 
clones. The E2 gene encodes a protein of 1324 amino acids including a hydrophobic signal peptide. A second large 
hydrophobic domain is found near the COOH terminus and probably represents the membrane anchor. Twenty 
glycosylation sites are predicted. Cleavage of the E2 protein results in two different 90K species, 90A and 90B (L. S. 

Sturman, C. S. Ricard, and K. V. Holmes (1985)7. Virol. 56, 904-911), and activates cell fusion. Protein sequencing of 
the trypsin-generated N-terminus revealed the position of the cleavage site. 90A and 90B could be identified as the 
C-terminal and the N-terminal parts, respectively. Amino acid sequence comparison of the A59 and JHM E2 proteins 
showed extensive homology and revealed a stretch of 89 amino acids in the 90B region of the A59 E2 protein that is 
absent in JHM. © 1987 Academic Press, Inc. 


INTRODUCTION 

Murine hepatitis viruses (MHV) are coronaviruses 
which cause a variety of diseases including hepatitis 
and encephalomyelitis in the natural host (Wege etal., 
1982). They are studied extensively, since MHV is a 
useful animal model for virus-induced demyelination 
and because coronaviruses possess a unique mode of 
replication (Siddell etal., 1983). 

The infectious genome of MHV consists of a single- 
stranded RNA of about 20 kb which is associated with 
a single protein species with a mol wt of 54K in a 
helical nucleocapsid. Two membrane-associated pro¬ 
teins are present in the virions: the large glycoprotein 
E2, forming the characteristic surface projections or 
peplomers, and the smaller membrane glycoprotein E1 
(26.5K) (Armstrong et at., 1984a). The peplomer pro¬ 
tein, encoded by mRNA 3 (Rottier etal., 1981), is syn¬ 
thesized on ribosomes bound to the rough endoplas¬ 
mic reticulum (RER) where it is cotranslationally glyco¬ 
sylated (Sturman and Holmes, 1983; Holmes etal., 
1984) and subsequently acylated, probably during 
transport through the Golgi apparatus (Nieman and 
Klenk, 1981; Sturman et at., 1985). MHV virions bud 
from the RER and Golgi membranes and are appar¬ 
ently transported to the exterior by the internal secre¬ 
tory apparatus. Two forms of the E2 protein are 
present on the surface of the virion, the 180K and the 

' To whom requests for reprints should be addressed. 


90K species (Sturman and Holmes, 1977). Recently it 
has been shown that the 90K protein consists of two 
different species, 90A and 90B, arising from proteo¬ 
lytic cleavage of the 180K protein (Sturman et al., 
1985). This cleavage activates cell fusion, and the ratio 
of 180K to 90K proteins is host dependent (Sturman et 
al., 1985; Frana et ai, 1985). It has been suggested 
that such host-dependent differences in the process¬ 
ing of E2 may be important for cytopathic effects, viru¬ 
lence, and tissue tropism of the murine coronaviruses 
(Frana et al., 1985). 

The peplomer protein is involved in cell attachment 
(Collins et ai, 1982) and is the target for neutralizing 
antibodies (Fleming etal., 1983). E2 plays an important 
role in the pathology of MHV. Buchmeier et al. (1984) 
showed that in MHV-JHM-infected mice passive 
transfer of neutralizing monoclonal antibodies, recog¬ 
nizing E2, prevented fatal infection by wild-type virus. 
Instead, a chronic demyelinating disease developed. 
These changed pathogenic properties seem to be a 
result of mutations in E2 (Dalziel etal., 1986; Fleming 
et al., 1986). To understand the biological and patho¬ 
genic properties of MHV at the molecular level the 
primary structure of E2 and data on its processing are 
essential. Here we report the cDNA cloning and se¬ 
quence analysis of the gene encoding the E2 protein 
of MHV-A59. By direct amino acid sequence analysis 
of the N-terminal part of the 90A species we were also 
able to identify the trypsin cleavage site. E2 is the main 
structural protein determining strain differences be- 


479 


0042-6822/87 $3.00 

Copyright © 1987 by Academic Press, Inc. 

All rights of reproduction in any form reserved. 



480 


LUYTJES ET AL. 


cause it shows significant antigenic polymorphism, in 
contrast to the other structural proteins (Talbot and 
Buchmeier, 1985). It will probably reflect the major 
differences between related coronavirus species. To 
localize these differences we have compared the E2 
gene sequence of the MHV-A59 strain with that of 
strain JHM published recently (Schmidt et al., 1987). 

MATERIALS AND METHODS 

Identification of the trypsin cleavage site by amino 
acid sequence analysis 

Virus purification was carried out with modification 
as described by Sturman era/. (1980, 1985). Purified 
90A and 90B proteins were prepared from trypsin- 
treated virions and the 180K E2 protein was prepared 
from untreated virus. Following incubation with tryp¬ 
sin, 10 jitg/ml, in TMEN, pH 6.5 (50 m/W Tris-maleate, 
1 m M EDTA, 100 m M NaCI), at 37° for 30 min, soy¬ 
bean trypsin inhibitor, 50 Mg/ml, was added for 30 min 
at 4°. Virus was then sedimented at 24,000 rpm in an 
SW 28 rotor at 4° for 2.5 hr. E2 was extracted with 
Triton X-114, and 90A and 90B were separated as 
described previously by HPLC on HPHT (hydroxyapa¬ 
tite) columns in sodium dodecyl sulfate (SDS) (Ricard 
and Sturman, 1985). Uncleaved (180K) E2 was sepa¬ 
rated from 90K species by HPLC size exclusion chro¬ 
matography with a Bio-Sil TSK (Bio-Rad) guard column 
and Bio-Sil TSK 400, 7.5 X 300 mm, and Spherogel 
TSK 4000 (Altex), 7.5 x 300 mm, columns connected 
in series. SDS was removed from purified proteins by 
ion pair extraction with acetone:triethylamine:acetic 
acid:water, 85:5:5:5 (Henderson et al., 1979). Proteins 
were washed with trifluoroacetic acid, lyophylized, and 
dissolved in trifluoracetic acid. The amino terminal se¬ 
quence was determined by automated Edman degra¬ 
dation using an Applied Biosystems gas phase se¬ 
quencer. Phenylthiohydantoyn (PTH) amino acids 
were identified by HPLC. 

cDNA synthesis and cloning 

Viral genomic RNA and poly(A)-containing intracel¬ 
lular RNAs were isolated from purified MHV-A59 vir¬ 
ions and infected cells, respectively (Spaan et al., 
1981). Procedures for synthesis of cDNA were essen¬ 
tially identical to those described by Dowling (1983) 
and Gubler and Hoffman (1983). For the synthesis of 
the single-stranded cDNA, pentanu,cleotides and spe¬ 
cific primers were used. Full details will be presented 
elsewhere (P. J. Bredenbeek etai, manuscript in prep¬ 
aration). 

After homopolymer tailing of the double-stranded 
cDNA (Peacock, 1981) or digestion with restriction en¬ 


donucleases, the cDNA was annealed to dG-tailed 
pUC9 DNA (Pharmacia) or ligated to pEMBL DNA 
(Dente etai, 1983), respectively. Transformation was 
carried out by adding the annealed or ligated DNA to 
Escherichia coti strain JM101 or JM109 competent 
cells (Messing, 1983), prepared by the method de¬ 
scribed by Hanahan (1983), which were subsequently 
plated on petri dishes containing 25 jtg/ml ampicillin. 

Screening and analysis of recombinants 

Plasmid DNA from ampicillin-resistant colonies ob¬ 
tained after transformation of the ligated restriction 
fragments was prepared according to the method de¬ 
scribed by Birnboim and Doly (1979). The mapping of 
the cDNA clones on the genome will be described in 
detail elsewhere (P. J. Bredenbeek etai., manuscript in 
preparation). 

Formaldehyde-agarose gel analysis and 
hybridization 

Poly(A)-containing RNA from MHV-infected cells 
was denatured in the presence of formaldehyde and 
separated in an 1.5% agarose-formaldehyde gel 
(Lehrach et al., 1977). After electrophoresis the gel 
was dried on Whatmann 3 MM paper and subse¬ 
quently incubated with a kinase-labeled oligonucleo¬ 
tide probe according to Meinkoth and Wahl (1984). 
Hybridization and washing temperature was 5-10° 
below the calculated T d . 

Oligonucleotide synthesis 

Oligonucleotides were prepared as described pre¬ 
viously (Niesters et al., 1986) or were synthesized 
using a DNA-synthesizer, Biosearch Model 8600, and 
subsequently purified by HPLC. 

DNA sequence analysis 

DNA fragments were prepared by digestion with a 
variety of restriction enzymes and ligated either as a 
mixture or as single fragments purified from agarose 
gels into the Ml3 vectors mp8 and mp9 (Messing, 
1983). White plaques were screened for viral inserts 
using pentamer-primed probes from cDNA clones 
(Feinburg and Vogelstein, 1983; Roberts and Wilson, 
1985). Single-stranded Ml3 DNA was isolated and 
used for sequence analysis using the dideoxynucleo- 
tide chain termination procedure of Sanger et al. 
(1977). Sequence data were assembled and analyzed 
using the computer programs created by Staden 
(1986). 
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Protein sequence homology searches 

The predicted amino acid sequence was compared 
to other sequences and to the NBRF Protein Bank 
using the FASTP program set of Lipman and Pearson 
(1985) and the DIAGON program of Staden (1982). 

RESULTS 

cDNA cloning, mapping of recombinant plasmids, 
and sequence analysis 

When we started this sequence study, a number of 
cDNA clones against cellular RNAs of MFIV-A59 was 
already available. Mapping by hybridization to the viral 
mRNAs(P. J. Bredenbeekefa/., manuscript in prepara¬ 
tion) and sequence analysis indicated that the over¬ 
lapping clones 95, 918, and 85 were positioned 
around the 5' end of the E2 gene (Fig. 1). Clone 853 


was mapped beyond the 3' end in gene D. Oligonu¬ 
cleotide 7 (OL 7) complementary to a sequence in the 
3' end of clone 85 and oligonucleotide 8 (OL 8) based 
upon the sequence of clone 853 were synthesized and 
used to screen the new random genomic cDNA li¬ 
brary. Several positive recombinant DNA clones were 
isolated and characterized by restriction site mapping. 
This permitted construction of a continuous map of 
approximately 5 kb containing the complete unique 
region of mRNA 3 encoding the E2 protein. 

The large insert of clone B24 was isolated and sub¬ 
sequently digested with restriction endonuclease 
HpaW or Taq\. The complete digests were ligated into 
Ml3 mp9. Initial selection of subclones overlapping 
the consensus sequence of clones 95, 918, and 85 
was performed by hybridizing a probe from clone B60 
to phage DNA. The sequence strategy is summarized 
in Fig. 1. 
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Fig. 1. Cloning and sequencing strategy of the MHV-A59 E2 gene and hydrophobicity pattern of the predicted amino acid sequence. Open 
boxes represent open reading frames in the coding regions of RNA 2 <B), RNA 3 (E2), and RNA 4(D) transposed on the genome. Vertical bars 
indicate homology regions in the intergenic sequences of MHV-A59. Numbers represent the nucleotide distance to the start of the 3’ poly(A) tail. 
The vertical arrow points at the trypsin cleavage site. The double arrow marks the region in MHV-A59 absent in strain JHM. Small boxes 
represent the synthetic oligonucleotides used in cloning, sequencing, and hybridizations, numbered when referred to in the text. cDNA clones 
are indicated by horizontal lines. Extent and direction of sequencing is shown by means of the arrows below. Symbols indicating restriction sites 
are explained in the figure. The hydrophobicity pattern was generated using the HYDROPLOT program created by Staden (1986), modified with 
hydrophobicity data from Eisenberg et at. (1982). Above the line is hydrophobic. 






MHV-JHM: AMINO ACID SEQUENCE (DIFFERENCES FROM A59) L 

MHV-A59: PREDICTED AMINO ACID SEQUENCE _ M l, FVF ILFLPSCLGY IGDFR 20 

MHV-A59: NUCLEOTIDE SEQUENCE CODING REGION RNA 3- n’AATCTAAACj VICCTCTrCGTGriTATTCTATTrn'GCCCTCTTG'nTAGGGTATATTGCTGAT'HTAGA 60 

TYNA ADR* 

ciqlvnsngaQOvsapsistetvevsqu'l gtyyvldrvyl[n] 60 
TGTATCCAGCTrGTGAATrCAAACGGTGCTAATGTTAGTGCTCCAAGCATTAGCACTGAGACCG'rrGAAGITfCACAAGGCCTGGGGACATATTATGTG'ITAGATCGAG'nTA'nTAAAr 180 

NY TL TKFSE 

ATLLLTGYYPVDGSKFRNLA l. TGTNSVSLSWFQPPYI. NQF 100 

CCCACATTATTGCmCTGGmcTACCCCGTCGATGGTrCTAAGrmCAAACCTUICTCTIAUGGAACTAACTCAGTrAGCITGTaiTGGITrCAACCACCCTAmAAATCAGITr 300 

NTS N 

NDGIFAKVQNLKTSTPSGATAYFPTIV1GSLFGYTSYTVV 140 
AATGATGGCATAnTGCGAAGGTGCAGAACCrrAAGACAAGTACGCCATCAGCl'TGCAACTGCATA'rlTrCCTACTATAGITATAGCTAGTI’rGrrTGGCTATACTrCCTATACCG'rrCTA 420 

L N I T P R V 

1EPYNGVIMASVCQYT1CQLP YTDCKPNTNGNKI, IGFWIIT 180 

ATAGAGCCATATAATGGTGTTATAATGGCCTCAGTGTGCCAGTATACCATTrGTCAGTlACC'ITACACTGA'rrGTAAGCCTAACACTAA'I'GGTAATAAGCTrATAGGGTnTGGCACACG 540 

L F P W 1, Q 

dvkppicvlkr[n]ftlnvnadafyfhfyqiiggtfyayyadk 220 

GATGTAAAACCCCCAATTTGTGTGTTAAAGCGAAATnCACGCTTAATGTTAATCCTGATGCATnTArrrTCArmTACCAACATGGTGGTACTTTJTATGCGTACTATGCGGATAAA 660 

F T 1. L 

PSATTFLFSVYIGDILTQYY VLPF ICNPTAGSTFAPRYWV 260 

CCCTCCGCTACTACGTmTGTTTAGTGTATATATTGGCGATATTTTAACACAGTATTATGTGTrACCrrTCATCTGCAACCCAACAGCTGGTAGCAC'nTTGCTCCGCGCTATTGGGTT 780 

L E 1 M 

TPLVKRQYLFNFNQKGVITSAVDCASSYTSEIKCKTQSML 300 
ACACCTTTGGTTAAGCGCCAATATTTGTTTAATTTCAACCAGAAGGGTGTCATTACTAGTGCTG'ri’CATTGTGCTAGTAG'lTATAGCAGTUAAATAAAATGTAAGACCCAGAGCATUTTA 900 

D P D K K 

PSTGVYELSGYTVQPVGVVYRRVANl. PACN1EF. W I. TARSV 340 

CCTAGCACTGGTGTCTATGAGTTATCCGGTTATACGGTCCAACCAGnGGAGrrGTATACUNXYlTinTGCTAACCTCCCAGCTl'GTAATATAGAGGAGTGGLTrACTGCTAGGTCAGTC 1020 


PSPLNWERKTFQNCNF0LSS I. LRYVQAESI. FCNNIDASKV 380 
CCCTCCCCTCTCAACTGGGAGCGTAAGACTnTCAGAATrGTAATnTAA'rn'AAGCAGCCrGTrACGTrATGTrCAGGCTGAGAGTTrGlTrrGTAATAATATCGATGCTTCCAAAGTG 1140 

MV I II 

YGRCFGSISVDKFAVPRSRQVDLQLGNSGFLQTANYK1D T 420 

TATGGCAGGTGCTTTGGTAGTATn'CAGTTCATAAGTTTGCTGTACCCCGAAGTAGGCAAGriGArlTACAGCTTGGTAACTCTGGATTTCTGCAGACTGCI'AAlTATAAGATTGATACA 1260 

Y S Y K * * * * * “I * 

aatscqlhytlpkn(n]vtinniinpsswnrrygfniiagvfgk 460 

GCTGCCACTTCGTGTCAGCTGCATTACACCTTGCCTAAGAATAATGTCACCATAAACAACCATAACCCCTCGTClTGGAATAGGAGGTATGGCTTrAATGATGCTGGCGTmTGGCAAA 1380 

*********************««*****#**#****$*$# 
NQHDVVYAQQCFTVRSSYCPCAQPDIVSPCTTQTKPKSAF 500 
AACCAACATGACGTTGnTACGCTCAGCAATGTTTTACTGTAAGATCTAGTTATTGCCCGTGTCCTCAACCGGACATAGTTAGCCCTTGCACTACTCAGACTAAGCCTAAGTCTGCTTIT 1500 

4$*«**4****#***#*#** ««***#>*** * # * * * * # * s:t sit # 

VNVGDHCEGLGVLEDNCGNADP1IKGCICa[n]nSFIGWSIID T 540 

GTrAATGTGGGTGACCATTGTGAAGGClTAGGTGTTITAGAAGATAATl'GTGGCAATGCrGATCCACATAAGGG'lTCTATCTGTGCCAACAAlTCATrrAT’l'GGATGGTCAl A IGA I ACC 1620 


CLVNDRCQ I F ANILLNGINSG T TCS TDLQLPNTEVVTGIC 580 
TGCCTTGTrAATGATCGCTGCCAAATTTTTGCTAATATATTGTTAAATGGCATTAATAGTGGTACCACATGTTCCACAGATTTGCAGTTGCCTAATACTGAAGTGGTTACTGGCATTTGT 1740 

R A 

VKYDLYGITGQGVFKEVKADYYNSWQTLLYDVNGNLNGFR 620 
GTCAAATATCACCTCTACGGTATTACTGGACAAGGTGTTTTTAAAGAQCTTAAGGCTGACTATrATAATAGCTGGCAAACCCTTCTGTATGATGTTAATGCTAAriTCAATGGTrn'CUT 1860 

.—. YE 

dlttQDktytirscysgrvsa afhkdapepallyrni[n]csy 660 
GATCTTACCACTAACAAGACTTATACGATAAGGAGCTGTTATAGTGGCCGTGnTCTGCrGCATrrCATAAAGATGCACCCGAACCGGCTCTGLTCTATCGTAATATAAATrCTAGCTAT 1980 

T „ N 

VFSN[N]lSREENPLNYFDSYLGCVVNAD0RTDEALPNCDLR 700 
GTTTTTAGCAATAATATTTCCCGTGAGGAGAACCCACrrAATTACTTTGATAGlTA'nTGGGrrCTGITGTrAATGCTCATAACCCCACGGATGAGGCGClTCCTAA'ITGTGATCTGCGT 2100 

R M 

MGAGLCVDYSKSRRAH R^ S VSTCYRLTTFEPYTPML V [n] U S V 740 

ATGGGTGCTGGCrrATGCGITGA™TTCAAAATCACGCAGGGCTCACCGATCAGWCTA^J l c!GcTrrCGGrPAAC , TAd:AW(!A(56?!lTA(5A^Tci(!!ATc)WA(JrrAAT(jATAC , rGiC 2220 

G l A 

QSVDGLYEMQIPT0FT1GI1HEEF1QTRSPKVT1DCAAFVC 780 
CAATCCG'rrGATGGATTATATGAGATGCAAATACCAACCAATTTTACTA'n'GGGCACCATGAGGAGTTCATTCAAACTAGATCTCCAAAGGTGACTATACATrGTCCTCCATTTCTCTCT 2340 

A D 

CDNTACRQQLVEYGSFCVNVNAILNEVNNLLDNMQLQVAS 820 
GGTGATAACACTGCATGCAGGCAGCAGTTGGTTGAGTATGGCTCTrTCTGTGTTAATGTTAATGCCAITClTAATGAGGITAATAACCTCTTGGATAATATGCAACTACAAGrrGCTAGT 2460 


ALMQGVTISSRLPDGISGPI DDI[n]fSPLLGCIGSTCAEDG 860 
GCATTAATGCAGGGTGTTACTATAAGCTCGAGACTGCCAGACGGCATCTCAGGCCCTATAGATGACATTAATTTTAGTCCTCTACTTGGATGCATAGGTTCAACATGTGCTGAAGACGGC 2580 


NGPSAIRGRSAIEDLLFDKVKLSDVGFVEAYN0CTGGQEV 900 
AATGGACCTAGTGCAATCCGAGGGCGTTCTGCTATAGAGGATTTGTTATTTGACAAGGT; AAATTATCTGATGTTGGCTTTGTCGAGGCTTATAATAATTGCACCGGTGGTCAAGAAGTT 2700 

Fig. 2. Nucleotide and predicted amino acid sequence of the MHV-A59 E2 gene. Numbering starts at the ATG codon (arrow) at position 
-7403 from the poiy(A) tail. Dots mark the N-terminal signal sequence and the C-terminal membrane anchor. The trypsin cleavage generated 
N-terminal amino acid sequence of 90A as analyzed by Edman degradation is underlined. The cleavage site between 90B and 90A is indicated 
by an arrowhead. Potential glycosylation sites are indicated by boxed asparagine residues. The MHV-JHM amino acid sequence (Schmidt etal., 
1987) is printed where differences with MHV-A59 occur. Asterisks represent deletions. Intergenic homology regions are boxed. 
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RDLLCVQSFNGI KVLPPVLS ESQI SGYTTGAT AAAMFPPW 940 

CGTGACCTCCTTTGTGTACAATCTTTTAATGGCATCAAAGTATTACCTCCTGTGTTGTCAGAGAGTCAGATCTCTGGCTACACAACCGGTGCTACTGCGGCAGCTATGTTCCCACCGTGG 2820 

T N 

S A A A G V P F S L S V Q Y R I N G L G V T M N V L S E N Q K M I A S A F N N A 980 

TCAGCAGCTGCCGGTGTGCCATTTAGTTTAAGTGTTCAATATAGAATOATGGTTTAGGTGTCACTATGAATGTGCTTAGTGAGAACCAAAAGATGATTGCTAGTGCTnTAACAATGCG 2940 

E 

LGAIQDGFDATMSALGKIQSVVNANAEALNNLLNQLSNRF 1020 
CTGGGTGCTATCCAGGATGGGTTTGATGCAACCAA'n’CTGCTTTAGGTAAGATCCAGTCrGTTG'n'AATGCAAATGCTGAAGCACTGAATAAGTTACTAAATCAGCTTTCTAACAGGTTT 3060 

D 

GAISASLQEILTRLEAVEAKAQ1DRLINGRL TALNAYISK 1060 
GGTGCTATTAGTGCTTCmACAAGAAATTCTAACTCGGCTTGAGGCTGTAGAAGCAAAAGCCCAGATAGATCGTCTTATTAATGGCAGGTTAACTGCACTTAATGCGTATATATCCAAG 3180 

F 

QLSDSTLIKVSAAQAIEKVNECVKSQTTRINFCGNGNHIL1100 
CAACTTAGTGATAGTACGCTTATTAAAGTTAGTGCTGCTCAGGCCATAGAAAAGGTCAATGAGTGCGTTAAGAGCCAAACCACGCGTATTAATTTCTGTGGCAATGGTAATCATATATTA 3300 


SLVQNAPYGLYFIHFSYVPISFTTA[T]VSPGLCISGDRGLA 1140 
TCTCITGTCCAGAATGCGCCTTATGGCTTATATTTTATACACTTCAGCTATGTGCCAATA TCCTTTACAACCGCAAATGTGAGTCCTGGACTTTGCA'nTCTGGTGATAGAGGATTAGCA 3420 

N N A I 

PKAGYFVQDDGEWKFTGSSYYYPEPITDKNSVIMSSCAV[N]ll80 
CCTAAAGCTGGATATTTTGTTCAAGATGATGGAGAATGGAAGTTCACAGGCAGTTCATATTACTACCCTGAACCCATTACAGATAAAAACAGTGTCATTATGAGTAGTTGCGCAGTAAAC 3540 

N L 

ytkapevfl[n]tsipnppdfkeeldkwfk[n]qtsiapdlsld 1220 

TACACAAAGGCACCTGAAGTTTTCTTGAACACTTCAATACCTAATCCACCCGACTTTAAGGAGGAGTTAGATAAATGGTTTAAGAATCAGACGTCTA1TGCGCCTGATTTATCTCTCGAT 3660 
F 

FEKL[n]vTLLDLTYEMNRIQDAIKKL0ESYINLKEVGTYEM 1260 
TTCGAGAAGTTAAATGTTACTTTGCTGGACCTGACGTATGAGATGAACAGGATTCAGGATGCAATTAAGAAGTTAAATGAGAGCTACATCAACCTCAAGGAAGTTGGCACATATGAAATG 3780 

R 

YVKWPWYVWLLIGLAGVAVCVLLFFICCCTGCGSCCFKKC1300 
TATGTGAAATGGCCTTGGTATGTTTGGTTGCTAATTGGATTAGCTGGTGTAGCTGTTTGTGTGTTGTTATTCTTTATATGTTGCTGCACAGGTTGTGGCTCATGTTGTTTTAAGAAGTGT 3900 



CNCCDEYGGHQDSIVIH0IS SHED* 1324 

GGAAArTGTTGTGATGAGTATGGAGGACACCAGGACAGTATTGTGATACATAATAnTCCTCTCATGAGGATTGACTATCACAGCCTCTCCTGGAAAGACAGA kAAY^tAAA^l 4073 


Fig. 2— Continued. 


Nucleotide and amino acid sequence 

The consensus nucleotide sequence shows an 
open reading frame (ORF) of 3972 nucleotides 
stretching from position -7403 to -3429 from the 
poly(A) tail. The initiation codon lies immediately adja¬ 
cent to a short sequence which is fully compatible with 
the intergenic homology sequence 5'-(A/T)AATC(T/ 
QAAAC-3' (Bredenbeek et at., 1987). A similar se¬ 
quence is found 28 nucleotides downstream from the 
end of the ORF (Fig. 2). There were no alternative 
ORFs longer than 60 amino acids in the unique region 
of mRNA 3. The large ORF is therefore identified as the 
coding sequence of the E2 protein. 

The ORF encodes a protein of 1324 amino acids 
with some typical features. The N-terminal region (Fig. 
2) contains a stretch of amino acids consistent with a 
signal sequence (Von Fleyne, 1986). Another region of 
high hydrophobicity is found at the C-terminus and 
probably represents a membrane anchor. In the hy¬ 
drophobicity plot this region appears as a strong sym¬ 
metrical peak (Fig. 1). It starts with a series of nonpolar 
amino acids spanning the membrane and ends with a 
cluster of cysteine residues; it is followed by a number 
of charged residues which are probably located at the 
interior of the virion. 


The ORF potentially codes for an apoprotein with a 
mol wt of 146K, which is in the range reported by 
several authors (see Siddell et at., 1983; Repp et at., 
1985). Based on the assumption that Asn-X-Thr and 
Asn-X-Ser (X not being Pro) signals can be glycosyla¬ 
ted and assuming that the extreme C-terminal site is 
located in the interior and thus unlikely to be used, we 
could identify 20 potential sites for N-glycosylation 
(Neuberger et at., 1972). These are enough to add the 
extra 35K needed to reach the M r of 180K required for 
the E2 protein. Acylation of E2 has been reported 
(Sturman et at., 1985) but little is known about acyla¬ 
tion signals; we could therefore not determine its con¬ 
tribution to the weight of the protein. 

Identification of the trypsin cleavage site 

The two 90K cleavage products, designated 90A 
and 90B, can be separated by SDS-hydroxyapatite 
chromatography (Ricard and Sturman, 1985). The lo¬ 
cation of the trypsin cleavage site and the relationship 
of 90A and 90B to the uncleaved protein was deter¬ 
mined by comparison of the amino terminal sequence 
identified by Edman degradation with the sequence 
deduced from analysis of cDNA. 
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The amino terminal sequence of 90A, Ser-Val- 
Ser-Thr-Gly-Tyr-Arg-Leu-Thr-Thr-Phe-Glu-Pro- 
Tyr-Thr-Pro-Met-Leu, is identical to the sequence 
underlined in Fig. 2. The trypsin cleavage site can thus 
be positioned between residues 717 and 718 in the 
amino acid sequence. The 90B and 180K species ap¬ 
pear to possess blocked amino termini as no definitive 
amino terminal sequences could be determined. 

The identification of a signal sequence and of a 
membrane anchor, the determination of the amino ter¬ 
minal sequence of 90A, and the finding that 90A but 
not 90B is acylated (Sturman et at., 1985) allows us to 
conclude that the structure of E2 is NH 2 -90B-90A- 
COOH. The cleavage products 90A and 90B have 
ORFs with lengths of 606 and 717 amino acids, re¬ 
spectively, corresponding with coding capacities for 
apoproteins of 66K and 79K. 

Comparison of the peplomer protein sequences of 
MFIV strains A59 and JHM 

Considerable polymorphism has been seen on the 
E2 glycoprotein of coronaviruses (Talbot and Buch- 
meier, 1985). To localize the differences we have 
compared the predicted amino acid sequences of the 
E2 protein of MFIV strains A59 and JFHM (Fig. 2). 

The two proteins are highly conserved: there is an 
overall homology of 93% and 90A is more conserved 
than 90B (96 and 89%, respectively). Flowever, there 
is a remarkable difference: starting at amino acid (aa) 
454 we find a stretch of 89 aa (267 nucleotides) that is 
not present in the E2 sequence of JFHM. To rule out the 
possibility that this additional sequence is the product 
of cDNA cloning artifacts, we isolated and sequenced 
several independent cDNA clones covering the region 
(G10, All, and B60, Fig. 1). They all contained the 
additional sequence. We then synthesized an oligonu¬ 
cleotide (OL 53, Fig. 1) complementary to nucleotide 
position 1423 to 1442 in the A59 sequence and hy¬ 
bridized it to MFIV-A59 poly(A)-selected messengers 
separated by electrophoresis. It is clear from Fig. 3 
that the A59 “insertion'' is an actual genomic feature, 
as it is found in mRNAs 3, 2, and 1. The extra bands in 
the gel can not be accounted for but have been found 
with other MFIV probes (data not shown) and possibly 
represent leaderless RNAs. 

The fact that the sequences of both strains can be 
perfectly aligned (when we exclude the additional se¬ 
quence) allows a nucleotide to nucleotide comparison 
and the creation of a mutation table (Table 1). The 
sequences of the genes coding for the nucleocapsid 
(N) and the matrix (El) protein were included because 
their products show little antigenic variation (Talbot 
and Buchmeier, 1985) and may thus be used as refer¬ 
ences. The ratio of nonsilent to silent (N/S) mutations 


A B 



Fig. 3. Hybridization of a synthetic oligonucleotide from the 
MHV-A59 E2 region absent in strain JHM to the MHV-A59 messen¬ 
ger RNAs. Lane A, hybridization with OL 53 (Fig. 1) specific for the 
additional sequence of MHV-A59. Lane B, hybridization with an 
oligonucleotide complementary to part of the A59 leader sequence 
(Spaan et at., 1984). RNAs are numbered according to Spaan et at. 
(1981). 

can be interpreted as an indication of mutation selec¬ 
tion. In random mutated sequences, when no selec¬ 
tion mechanism is involved, this ratio will be about 3. 
Lower ratios will reflect selection against mutation 
whereas higher values indicate positive selection. For 
functional genes, however, this ratio ranges from 0.2 
to 1.7, since many mutations will be lethal and there¬ 
fore not found (Flewett-Emmett etal., 1982). The ratios 
N/S for the coronaviral proteins are indeed in this 
range (Table 1). When we consider the/V and £/genes 
as less susceptible to selective pressure we can un¬ 
derstand the lower ratio found for the 90A species— 
about half of that of the other proteins (Table 1)—as an 
indication of a negative selection, i.e., supression of 
amino acid mutations. 

DISCUSSION 

The unique region of MFIV-A59 mRNA 3 contains 
the information for the viral peplomer protein E2 (Rot- 
tier et at., 1981). The nucleotide and derived amino 
acid sequence of the gene presented in this paper 
allows us to position several functional domains of the 
coronaviral peplomer protein in the sequence. The 
predicted signal sequence at the N-terminus is con¬ 
sistent with the finding that E2 is translated on ribo¬ 
somes bound to the rough endoplasmic reticulum 
(Holmes et ai, 1984). 
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TABLE 1 


Mutational Differences between MHV Strains A59 and JHM 


Gene 

N/S 

Total number of mutated 

Nucleotides (%) 

Amino acids (%) 

90B 

0.46 

253 (13.4) 

67 (10.7) 

90A 

0.26 

122 (6.7) 

25 (4.1) 

El 

0.50 

21 (3.1) 

7(3.1) 

N 

0.53 

98 (7.2) 

29 (6.4) 


Note. Mutations were scored from aligned sequences. The ratio 
nonsilent to silent (N/S) mutations was calculated based upon a 
method described by Nei and Gojobori (1986). The scores for 90B 
are obtained by excluding the stretch in A59 that is absent in JHM. 
90B, the N terminal cleavage product of the peplomer protein; 90A, 
the C-terminal part; El, matrix protein; N, nucleocapsid protein. 
Data are from Armstrong et at. (1984b), Skinner and Siddeli (1983), 
Pfleiderer et at. (1986), and Schmidt et ai (1987). 

The MHV trypsin cleavage site has been determined 
by analyzing the cleavage-generated amino terminus 
and localizing it in the protein sequence. Cleavage of 
E2 by trypsin is required for activation of the cell-fusing 
activity of the coronavirus (Sturman et a/., 1985). 
Cleavage activation of cell fusion is also found with the 
HA and F glycoproteins of myxo- and paramyxoviruses 
where a hydrophobic amino terminus is involved in cell 
fusion (Gething et ai, 1978; Richardson et ai, 1980). 
The amino terminal sequence of 90A shows no homol¬ 
ogy with analogous regions of HA2 and FI of myxo- 
and paramyxoviruses and does not have a similar 
highly hydrophobic character, because it contains two 
charged residues (Arg and Glu). Moreover, there is no 
sequence homology at the amino terminus of the tryp¬ 
sin cleavage site between the spike proteins of MHV 
and infectious bronchitis virus (IBV; Binns et ai, 1985), 
although their positions are similar. Cleavage of E2 by 
thermolysin, which has a specificity different from that 
of trypsin, also activates MHV-induced cell fusion 
(Baker and Sturman, manuscript in preparation). This 
suggests that proteolytic cleavage of E2 may expose a 
functionally important domain that is internal rather 
than adjacent to the cleavage site. The sequence up¬ 
stream of the cleavage site resembles the consensus 
sequences of trypsin cleavage sites of several other 
glycoproteins (Cavanagh et ai, 1986). 

Proteolytic cleavage of E2 appears to be an impor¬ 
tant determinant of MHV pathogenesis. Investigations 
are in progress to identify the host- and strain-depen¬ 
dent differences in the processing of E2. 

At its C-terminus 90A contains the highly hydropho¬ 
bic potential membrane anchor of the peplomer pro¬ 
tein. A feature of this sequence is that it starts with a 
stretch of eight residues: Lys-Trp-Pro-Trp-Tyr-Val- 


Trp-Lys which appears to be identical in coronavir- 
uses MHV-A59, MHV-JHM, IBV-M41 (Niesters et ai, 
1986), IBV-M42 (Binns et ai, 1985), feline infectious 
peritonitis virus (FIPV; R. J. De Groot et ai, manuscript 
in preparation), and transmissible gastroenteritis virus 
(TGEV; Jacobs et ai, manuscript in preparation). This 
sequence apparently represents a structural signal 
associated with membrane anchoring. Both E2 cleav¬ 
age products in virions have an apparent mol wt of 90K 
as determined by SDS-PAGE (Sturman et ai, 1985) 
but the ORFs of 90A and 90B differ in length and cod¬ 
ing capacity. In comparison in MHV-JHM the E2 cleav¬ 
age products are also of an equal apparent mol wt of 
98K (Siddeli et a!., 1981), yet in this strain the lengths 
of the 90A and 90B ORFs are similar. Even if we take 
into consideration the inaccuracy of electrophoretic 
size estimation due to different SDS binding capacities 
of the cleavage products, we cannot exclude the pos¬ 
sibility of extra or different processing of the A59 cleav¬ 
age products compared to JHM. 

It is not clear whether the additional sequence is 
deleted in JHM in the course of evolution or inserted 
into A59, but it is important to notice that it starts in an 
eight nucleotide stretch 5'-TTAATGAT-3'(Fig. 2) that is 
repeated at the point where the sequences of both 
strains are in step again. This repeat is possibly in¬ 
volved in the creation of the genetic difference be¬ 
tween A59 and JHM. 

Apparently the 90B part of the peplomer protein can 
undergo radical changes without losing its function. 
This is also reflected by the fact that 90B shows the 
highest relative number of mutations. In contrast 
90A is less mutated—-but more important—shows a 
much lower ratio of nonsilent to silent mutations. This 
indicates a selection against sequence changes. De 
Groot et al. (1987) compared the peplomer protein se¬ 
quences of coronaviruses from three different anti¬ 
genic clusters and found that the C-terminal parts 
were conserved whereas the N-terminal parts were 
not. They demonstrated that the C-terminal sequence 
contained sequence patterns that could explain the 
typical elongated form of the coronaviral spike. The 
negative selection in 90A may therefore reflect preser¬ 
vation of structural features. 

The fact that the ratio of nonsilent to silent mutations 
in 90B is comparable to that in the nucleocapsid and 
E1 gene suggests that there is no stronger positive 
selection mechanism—favoring escape mutations— 
in this part of the protein. Talbot and Buchmeier (1985) 
tested a panel of neutralizing monoclonal antibodies to 
MHV-JHM E2 on strain A59 and demonstrated that 
two conformation-dependent antigenic determinants 
were not shared by JHM and A59 whereas a third 
conformation-independent determinant was found on 
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both strains. From our data we suggest that the con- 
formation-dependent epitopes are on the more vari¬ 
able 90B part; the SDS-stable site is probably situated 
on the structurally important and higher conserved 
90A part of the MHV peplomer protein. Experiments 
are in progress to localize these epitopes in the pre¬ 
dicted amino acid sequence. 
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